-- The MAILING DATE of this communication appears on the cover sheet with the correspondence address --
Period for Reply 

A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 



- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed 
after SIX (6) MONTHS from the mailing date of this communication. 

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely.

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication. 

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133). 
- Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any
earned patent term adjustment. See 37 CFR 1.704(b).
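
The reply periods stated above reduce to simple date arithmetic. The following is a minimal sketch, assuming a hypothetical mailing date (the actual date appears only on the cover sheet) and ignoring any carry-over that may apply when a period ends on a weekend or holiday. The function names are illustrative and are not part of any Office system.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to the month's end."""
    index = d.month - 1 + months
    year, month = d.year + index // 12, index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def reply_deadlines(mailing_date: date, shortened_months: int = 3) -> dict:
    """Compute the shortened statutory period and the six-month statutory maximum."""
    return {
        "shortened_period_ends": add_months(mailing_date, shortened_months),
        "statutory_maximum_ends": add_months(mailing_date, 6),
    }


# Hypothetical mailing date, for illustration only.
deadlines = reply_deadlines(date(2004, 1, 15))
```

A reply filed after `shortened_period_ends` but on or before `statutory_maximum_ends` would need an extension of time under 37 CFR 1.136(a).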



3) [ ] Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

Disposition of Claims 

4) [X] Claim(s) 1-12 and 42-47 is/are pending in the application.

4a) Of the above claim(s) 13-41 is/are withdrawn from consideration.

5) [X] Claim(s) 42-47 is/are allowed.

6) [X] Claim(s) 1-12 is/are rejected.

7) [ ] Claim(s) is/are objected to.

8) [X] Claim(s) 13-41 are subject to restriction and/or election requirement.

Application Papers 

9) [ ] The specification is objected to by the Examiner.

10) [X] The drawing(s) filed on 17 March 1999 is/are: a) [X] accepted or b) [ ] objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).

Replacement drawing sheet(s) including the correction is required if the drawing(s) is objected to. See 37 CFR 1.121(d).

11) [ ] The oath or declaration is objected to by the Examiner. Note the attached Office Action or form PTO-152.

Priority under 35 U.S.C. § 119 

12) [ ] Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).
a) [ ] All b) [ ] Some * c) [ ] None of:

1. [ ] Certified copies of the priority documents have been received.

2. [ ] Certified copies of the priority documents have been received in Application No. ______.

3. [ ] Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).
* See the attached detailed Office action for a list of the certified copies not received.

Responsive to communication(s) filed on 17 March 1999.

2a) [ ] This action is FINAL. 2b) [X] This action is non-final.

DETAILED ACTION

Specification 

1. Applicant is reminded of the proper language and format for an abstract of the
disclosure. 

The abstract should be in narrative form and generally limited to a single 
paragraph on a separate sheet within the range of 50 to 150 words. It is important that
the abstract not exceed 150 words in length since the space provided for the abstract 
on the computer tape used by the printer is limited. The form and legal phraseology 
often used in patent claims, such as "means" and "said," should be avoided. The 
abstract should describe the disclosure sufficiently to assist readers in deciding whether 
there is a need for consulting the full patent text for details. 

The language should be clear and concise and should not repeat information 
given in the title. It should avoid using phrases which can be implied, such as, "The 
disclosure concerns," "The disclosure defined by this invention," "The disclosure 
describes," etc. 

Election/Restrictions 

2. Restriction to one of the following inventions is required under 35 U.S.C. 121: 

I. Claims 1-12 and 42-47, drawn to generation of language component 
vocabulary of word forms, classified in class 704, subclass 10.

II. Claims 13-41, drawn to an enhancement/extension or improvement to the 
speech recognition process by pattern matching, classified in class 704, 
subclasses 231, 251 and 254. 

3. Inventions I and II are related as subcombination and combination, respectively.
Inventions in this relationship are distinct if it can be shown that (1) the combination as
claimed does not require the particulars of the subcombination as claimed for
patentability, and (2) that the subcombination has utility by itself or in other
combinations (MPEP § 806.05(c)). In the instant case, the combination as claimed
does not require the particulars of the subcombination as claimed because other
vocabulary components may be used for the speech recognition system. The
subcombination has separate utility such as operation of dictionary building for
language translation.

4. During a telephone conversation with Frank V. Derosa on March 9, 2004, a
provisional election was made with traverse to prosecute Invention I, claims 1-12
and 42-47. Affirmation of this election must be made by applicant in replying to this
Office action. Claims 13-41 are withdrawn from further consideration by the examiner,
37 CFR 1.142(b), as being drawn to a non-elected invention.

Claim Rejections - 35 USC § 102 

5. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless -

(e) the invention was described in (1) an application for patent, published under section 122(b), by 
another filed in the United States before the invention by the applicant for patent or (2) a patent 
granted on an application for patent by another filed in the United States before the invention by the 
applicant for patent, except that an international application filed under the treaty defined in section 
351(a) shall have the effects for purposes of this subsection of an application filed in the United States
only if the international application designated the United States and was published under Article 21(2)
of such treaty in the English language. 

6. Claims 1-4 are rejected under 35 U.S.C. 102(e) as being anticipated by 
Kanevsky et al. (U.S. Patent No. 6,073,091 filed Aug. 6, 1997). 

As per claim 1, Kanevsky et al. discloses a method for generating a language
component vocabulary VC for a speech recognition system having a language
vocabulary V of a plurality of word forms, the method comprising the steps of:

partitioning the language vocabulary V into subsets of word forms based on
frequencies of occurrence of the respective word forms (C.3. lines 52, 53); and

in at least one of said subsets, splitting word forms having frequencies less than
a threshold to thereby generate word form components (C.4. lines 58-63).

As per claim 2, Kanevsky et al. discloses all of the limitations of claim 1, upon 
which claim 2 depends. Kanevsky further discloses: 

the frequencies of the word forms are estimated from a given textual corpus
(C.4. lines 13, 14).

As per claim 3, Kanevsky et al. discloses all of the limitations of claim 1, upon
which claim 3 depends. Kanevsky et al. further discloses:

said partitioning step includes the sub-step of numerating the plurality of word
forms in the language vocabulary V in descending order based on the frequencies
associated with each of the plurality of word forms (C.4. lines 10-14).

As per claim 4, Kanevsky et al. discloses all of the limitations of claim 1, upon
which claim 4 depends. Kanevsky et al. further discloses:

said partitioning step partitions the language vocabulary V into at least two
subsets S1 and S2, and said splitting step splits the word forms of subset S2 into
2-tuple components including stems and endings, but does not split the word forms of
subset S1 (C.4. lines 58-63).
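
The limitations recited in claims 1-4 can be illustrated with a short sketch. It is offered only as a reading aid under stated assumptions: the ending list, the threshold, and the toy corpus are hypothetical, and the suffix-matching rule is a simple stand-in rather than the method of the application or of Kanevsky et al.

```python
from collections import Counter

# Hypothetical ending list, for illustration only; not taken from the cited patents.
ENDINGS = ("ing", "ed", "s")


def partition_by_frequency(tokens, threshold):
    """Rank word forms by descending corpus frequency and cut at `threshold`."""
    counts = Counter(tokens)  # frequencies estimated from a textual corpus (claim 2)
    ranked = sorted(counts, key=lambda w: (-counts[w], w))  # descending order (claim 3)
    s1 = [w for w in ranked if counts[w] >= threshold]  # frequent word forms: kept whole
    s2 = [w for w in ranked if counts[w] < threshold]   # rare word forms: to be split
    return s1, s2


def split_stem_ending(word, endings=ENDINGS):
    """Return a (stem, ending) 2-tuple; the ending is "" when no ending matches."""
    for ending in sorted(endings, key=len, reverse=True):
        if word.endswith(ending) and len(word) > len(ending):
            return word[:-len(ending)], ending
    return word, ""


corpus = "the cat walked the dog the cat talking".split()
s1, s2 = partition_by_frequency(corpus, threshold=2)
components = {w: split_stem_ending(w) for w in s2}  # only subset S2 is split (claim 4)
```

Here S1 holds the word forms seen at least twice and is left unsplit, while each word form in S2 is mapped to a stem and an ending.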

Claim Rejections - 35 USC § 103 
7. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the
invention was made to a person having ordinary skill in the art to which said subject matter pertains.
Patentability shall not be negatived by the manner in which the invention was made.

8. Claims 5 and 7-12 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Kanevsky et al. (U.S. Patent No. 6,073,091, filed Aug. 6, 1997) in view of
Kanevsky et al. (U.S. Patent No. 5,835,888 Nov. 10, 1998).

Kanevsky et al. and Kanevsky et al. are analogous art in that they both
involve language modeling for speech recognition.

As per claim 5, Kanevsky et al. (U.S. Patent No. 6,073,091) discloses all of the 
limitations of claim 4, upon which claim 5 depends. Kanevsky further discloses: 

a splitting step comprising 3-tuple components (C.4. lines 26-28, 30, 31) 

Kanevsky et al. does not disclose: 

further partitioning the language vocabulary V into a third subset S3, with word 
forms therein being split in said splitting step into 3-tuple components including prefixes, 
stems and endings. 

However, as is well known in the art, Kanevsky et al. (U.S. Patent No.
5,835,888 Nov. 10, 1998) teaches partitioning the language vocabulary V into a subset
that includes prefixes, stems, and endings (C.4. lines 18-20). Therefore, at the time of
the invention, it would have been obvious to combine Kanevsky et al. with Kanevsky et
al. for the purpose of increasing the component size in a vocabulary set which would
have increased the recognition of larger words that included a prefix, stem and ending
while decreasing the size of dictionary needed to match these words.

As per claim 7, Kanevsky et al. (U.S. Patent No. 6,073,091) discloses all of the
limitations of claim 1, upon which claim 7 depends. Kanevsky et al. further discloses:

said splitting is performed using a fixed vocabulary (C.4. lines 14, 15: N=400,000);

Kanevsky et al. does not disclose:

a fixed list of allowable endings, with each word from the fixed vocabulary being 
split into at least a stem and an ending that is an element of the fixed set of endings. 

However, as is well known in the art, Kanevsky et al. (U.S. Patent No.
5,835,888) teaches having a fixed list of endings and each word from the vocabulary
being split into a stem and an ending that is an element of the fixed set of endings
(C.5. lines 9-13). Therefore, at the time of the invention, it would have been obvious to
combine Kanevsky et al. with Kanevsky et al. for the purpose of having a limit to the
amount of vocabulary and stem and ending sets which would increase the processing
time for a query into which word is to be recognized.

As per claim 8, Kanevsky et al. (U.S. Patent No. 6,073,091) and Kanevsky et al.
(U.S. Patent No. 5,835,888) disclose all of the limitations of claim 7, upon which claim 8
depends. Kanevsky et al. (U.S. Patent No. 6,073,091) does not disclose:

the fixed set of allowable endings includes an empty ending; 

However, as is well known in the art, Kanevsky et al. (U.S. Patent 5,835,888)
teaches having a list of allowed endings that includes empty endings (C.3. lines 50-54).
Therefore, at the time of the invention, it would have been obvious to combine
Kanevsky et al. with Kanevsky et al. for the purpose of having a limit to the amount of
vocabulary and stem and ending sets and having an empty ending for the case where
the stem doesn't have an ending which would increase the processing time for a query
into which word is to be recognized.

As per claim 9, Kanevsky et al. (U.S. Patent No. 6,073,091) discloses all of the
limitations of claim 1, upon which claim 9 depends. Kanevsky et al. does not disclose:

generating and storing a word form to corresponding word form components table;

However, as is well known in the art, Kanevsky et al. (U.S. Patent No.
5,835,888) teaches generating and storing a word form and its stem and endings in a
table (C.3. lines 50, 51). Therefore, at the time of the invention, it would have been
obvious to combine Kanevsky et al. with Kanevsky et al. for the purpose of efficiently
managing the word forms to word form components for further processing.

As per claim 10, Kanevsky et al. (U.S. Patent No. 6,073,091) and Kanevsky et al.
(U.S. Patent No. 5,835,888) disclose all of the limitations of claim 9, upon which claim
10 depends. Kanevsky et al. (U.S. Patent No. 6,073,091) does not disclose:

labeling each of the word form components stored in said table to distinguish
between stems, prefixes and endings;

However, as is well known in the art, Kanevsky et al. (U.S. Patent 5,835,888)
teaches labeling the word components in the stored table to distinguish between
components (Fig. 3A: the prefix, root/stem, and end are labeled). Therefore, at the time
of the invention, it would have been obvious to combine Kanevsky et al. with Kanevsky
et al. for the purpose of not confusing the tags associated with each segment of the
word form.
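
The limitations discussed for claims 7-10 (a fixed list of allowable endings that includes an empty ending, and a stored table whose components are labeled) can be sketched as follows. This is an illustration under assumptions only: the ending and prefix lists, the label names, and the naive prefix test are hypothetical and do not come from either Kanevsky reference.

```python
# Hypothetical fixed lists, for illustration only. "" is the empty ending.
ALLOWED_ENDINGS = ("", "s", "ed", "ing")
PREFIXES = ("re", "un")


def split_word(word):
    """Split a word into labeled components: optional PREFIX, then STEM and ENDING."""
    # Naive prefix test: any word starting with a listed prefix is treated as prefixed.
    prefix = next((p for p in PREFIXES if word.startswith(p) and len(word) > len(p)), "")
    body = word[len(prefix):]
    stem, ending = body, ""
    # Try longer endings first; the empty ending matches last, so a split always exists.
    for candidate in sorted(ALLOWED_ENDINGS, key=len, reverse=True):
        if body.endswith(candidate) and len(body) > len(candidate):
            stem, ending = body[:len(body) - len(candidate)], candidate
            break
    row = [("PREFIX", prefix)] if prefix else []
    row += [("STEM", stem), ("ENDING", ending)]
    return row


def build_component_table(vocabulary):
    """Map each word form to its labeled word form components."""
    return {word: split_word(word) for word in vocabulary}


table = build_component_table(["replayed", "cat", "walks"])
```

Every ending stored in the table is an element of the fixed set, and a word with no matching suffix receives the empty ending rather than being left out of the table.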

As per claim 11, Kanevsky et al. (U.S. Patent No. 6,073,091) discloses all of the
limitations of claim 1, upon which claim 11 depends. Kanevsky et al. (U.S. Patent No.
6,073,091) further discloses:

generating a map of said word forms to said word form components (C.4. lines
30-36: "...word forms are mapped into corresponding stem and ending numbers."; the
numbers are interpreted as the components), said map further including each of a
plurality of no-split words as being associated with itself (C.4. lines 59, 60);

Kanevsky et al. (U.S. Patent No. 6,073,091) does not disclose: 

filtering a textual corpus using the map to generate a textual component corpus
containing the non-split word forms and the word form components of the map;

accumulating the word form components and the non-split word forms generated 
by said filtering step in an n-gram language model; and 

determining counts of n-tuple sets of word form components and word forms to 
estimate n-gram probabilities for the n-gram language. 

However, as is well known in the art, Kanevsky et al. (U.S. Patent No. 5,835,888)
teaches filtering a textual corpus using the map to generate a textual component corpus
(C.4. lines 18-20: the sub-vocabularies are interpreted as the component corpus) and
accumulating the word form components and the non-split word forms generated by
said filtering step in an n-gram language model (C.5. lines 48-59) and determining
counts of n-tuple (C.5. lines 54-56: lists the n-tuple sets) sets of word form components
and word forms to estimate n-gram probabilities for the n-gram language (C.5. lines
63-65: the counts are n-gram based from the n-tuple sets). Therefore, at the time of the
invention it would have been obvious to combine Kanevsky et al. with Kanevsky et al.
The motivation for doing so would have been to obtain a corpus of corresponding word
forms to components and generate a way to find the probability of the components
correctly matching the full word forms without consuming an enormous amount of
memory space due to a model which only incorporated the full forms of the word,
which would improve the word recognition without substantially increasing the need for
data space.
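
The steps recited in claim 11 (a map from word forms to components, filtering a corpus through that map, and counting n-tuples to estimate n-gram probabilities) can be sketched as below. The map contents, the "+" marker on endings, and the unsmoothed maximum-likelihood estimate are assumptions chosen for illustration; neither reference is stated here to use them.

```python
from collections import Counter

# Hypothetical split map. No-split words are associated with themselves.
SPLIT_MAP = {
    "the": ("the",),
    "dog": ("dog",),
    "walked": ("walk", "+ed"),
    "talked": ("talk", "+ed"),
}


def filter_corpus(tokens, split_map):
    """Replace each word by its components; unmapped words pass through unchanged."""
    out = []
    for token in tokens:
        out.extend(split_map.get(token, (token,)))
    return out


def ngram_counts(items, n):
    """Count every run of n consecutive components."""
    return Counter(tuple(items[i:i + n]) for i in range(len(items) - n + 1))


def bigram_probability(bigrams, unigrams, first, second):
    """Unsmoothed estimate of P(second | first) from the accumulated counts."""
    denominator = unigrams[(first,)]
    return bigrams[(first, second)] / denominator if denominator else 0.0


component_corpus = filter_corpus("the dog walked the dog talked".split(), SPLIT_MAP)
unigrams = ngram_counts(component_corpus, 1)
bigrams = ngram_counts(component_corpus, 2)
```

Because the counts are taken over components rather than full word forms, "walked" and "talked" share the single ending component "+ed" in the model.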

As per claim 12, Kanevsky et al. (U.S. Patent No. 6,073,091) and Kanevsky et al.
(U.S. Patent No. 5,835,888) disclose all of the limitations of claim 11, upon which claim
12 depends. Kanevsky et al. (U.S. Patent No. 6,073,091) further discloses:

mapping every word in the corpus into an n-tuple word form component (C.2. lines
19-21).

9. Claim 6 is rejected under 35 U.S.C. 103(a) as being unpatentable over Kanevsky 
et al. (U.S. Patent No. 6,073,091) in view of Karaali et al. (U.S. Patent No. 5,930,754 
filed Jun. 13, 1997). 

As per claim 6, Kanevsky et al. (U.S. Patent No. 6,073,091) discloses all of the
limitations of claim 1, upon which claim 6 depends. Kanevsky et al. does not disclose:

splitting is performed subject to a constraint in which a word that contains a given 
string of letters is prevented from being split within the string if the string of letters 
corresponds to one phoneme. 

However, as is well known in the art, Karaali et al. teaches multiple letters
corresponding to a single phone, and in the alignment, not aligning a different phone
with the multiple letters. Therefore, at the time of the invention, it would have been
obvious to combine Kanevsky et al. with Karaali et al. The motivation for doing so would
have been to align corresponding letter pairs with the single phone for the purpose of
improving the accuracy of speech recognition due to a well known method of aligning
graphemes to phonemes.
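
The constraint recited in claim 6 (no split inside a string of letters that corresponds to one phoneme) can be illustrated with a small sketch. The list of multi-letter strings is a hypothetical English-like example and is not taken from Karaali et al. or from the application.

```python
# Hypothetical letter strings that each correspond to a single phoneme.
SINGLE_PHONEME_STRINGS = ("sch", "ch", "sh", "th", "ph")


def protected_positions(word):
    """Return the split positions that fall strictly inside a protected string."""
    blocked = set()
    for letters in SINGLE_PHONEME_STRINGS:
        start = word.find(letters)
        while start != -1:
            blocked.update(range(start + 1, start + len(letters)))
            start = word.find(letters, start + 1)
    return blocked


def allowed_splits(word):
    """List every (left, right) split of `word` that respects the constraint."""
    blocked = protected_positions(word)
    return [(word[:i], word[i:]) for i in range(1, len(word)) if i not in blocked]


ship_splits = allowed_splits("ship")
cat_splits = allowed_splits("cat")
```

For "ship", the position between "s" and "h" is blocked, so the word can be split only after "sh" or after "shi".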



Allowable Subject Matter

10. Claims 42-47 are allowed.

11. The following is a statement of reasons for the indication of allowable subject
matter:

Regarding claim 42 as understood by the Examiner, the closest prior art of
Kanevsky et al. (U.S. Patent No. 5,835,888) reads on providing a fixed set of allowable
endings, including an empty ending (C.3. lines 50-53, C.5. lines 9-13) and providing a
fixed set of constraints for splitting words into stems (C.5. lines 10-13), randomly splitting
a word to generate an ending from the fixed list of allowable endings (C.5. lines 9-16),
defining and storing a stem set containing the stem generated at said splitting and a
word set containing the word (C.3. lines 50-53: the table stores the stem and the word),
determining possible splits for a word to generate stems and endings therefrom, using
the fixed set of allowable endings and the fixed set of constraints (C.5. lines 9-13).

Prior art does not teach nor fairly suggest:

(c) the combination of initializing a split map of words and the corresponding
stems and endings by setting a variable t to a predetermined value, and selecting a first
word from the fixed vocabulary, (f) determining whether t is less than the size of the
vocabulary, obtaining a new word from the vocabulary V, when t is less than the size of
the vocabulary, (h) determining possible splits for the new word to generate stems and
endings therefrom, using the fixed set of allowable endings and the fixed set of
constraints, (i) determining whether there is a split for the new word that generates a
previously stored stem of the stem set, (j) splitting the current word into the previously
stored stem and an ending of the set of allowable endings, when there is a split for the
new word that generates the previously stored stem of the stem set, (k) determining
whether another previously stored stem in the stem set can be replaced by a new stem
generated, when there is no split for the current word that generates the previously
stored stem of the stem set, (l) redefining the stem set and the split map to include the
new stem generated at (h) in place of the other previously stored stem, when the other
previously stored stem can be replaced by the new stem generated at step (h), (m)
redefining the stem set to include any new stem into which the current word may be split
and extending the split map to include the current word by splitting the new word into
the new stem, when the other previously stored stem in the stem set cannot be
replaced by the new stem generated at step (h), and (n) incrementing t and returning to
step (f) if t is less than the size of the vocabulary V.
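
The loop recited in steps (c) through (n) can be sketched as follows. This is a reading aid only, not the claimed method: the statement of reasons does not say when a stored stem "can be replaced," so the sketch assumes a stored stem is replaceable when every word already mapped to it can also be written as the new stem plus an allowable ending. The ending list, the minimum stem length, and the choice of the longest stem at step (m) are likewise hypothetical.

```python
ENDINGS = ("", "s", "ed", "ing", "er")  # hypothetical fixed set, includes the empty ending
MIN_STEM = 3                            # hypothetical constraint on stem length


def possible_splits(word):
    """Step (h): all (stem, ending) splits allowed by ENDINGS and MIN_STEM."""
    found = [(word[:len(word) - len(e)], e) for e in ENDINGS
             if word.endswith(e) and len(word) - len(e) >= MIN_STEM]
    return found or [(word, "")]  # very short words are kept whole


def find_replacement(splits, stems, split_map):
    """Step (k), under the assumed criterion described in the lead-in."""
    for stem, ending in splits:
        for old in sorted(stems):
            users = [w for w, (s, _) in split_map.items() if s == old]
            if users and all(w.startswith(stem) and w[len(stem):] in ENDINGS
                             for w in users):
                return old, stem, ending
    return None


def build_split_map(vocabulary):
    split_map, stems = {}, set()
    t = 0                                                  # step (c)
    while t < len(vocabulary):                             # step (f)
        word = vocabulary[t]
        splits = possible_splits(word)                     # step (h)
        known = [s for s in splits if s[0] in stems]       # step (i)
        if known:
            split_map[word] = max(known, key=lambda s: len(s[0]))   # step (j)
        else:
            hit = find_replacement(splits, stems, split_map)        # step (k)
            if hit:                                        # step (l)
                old, stem, ending = hit
                for w, (s, _) in list(split_map.items()):
                    if s == old:
                        split_map[w] = (stem, w[len(stem):])
                stems.discard(old)
                stems.add(stem)
                split_map[word] = (stem, ending)
            else:                                          # step (m)
                stem, ending = max(splits, key=lambda s: len(s[0]))
                stems.add(stem)
                split_map[word] = (stem, ending)
        t += 1                                             # step (n)
    return split_map, stems


split_map, stems = build_split_map(["walked", "walks", "walking", "talk", "talker"])
```

In this run "walked" is first stored whole, then re-split as "walk" plus "ed" once "walks" shows that the shorter stem covers both words.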

Claims 43-47 are allowable as they further limit their parent claims.

12. As allowable subject matter has been indicated, applicant's reply must either
comply with all formal requirements or specifically traverse each requirement not
complied with. See 37 CFR 1.111(b) and MPEP § 707.07(a).

13. Any comments considered necessary by applicant must be submitted no later
than the payment of the issue fee and, to avoid processing delays, should preferably
accompany the issue fee. Such submissions should be clearly labeled "Comments on
Statement of Reasons for Allowance."

Conclusion 

14. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

Renz (U.S. Patent No. 6,038,527 filed Mar. 14, 1997) teaches splitting 
word forms into stems/descriptors for the classification of text. 
Ballard et al. (U.S. Patent No. 5,377,281 filed Dec. 27, 1994) teaches 
using tri-gram probabilities for the recognition of character strings. 
Weeks (U.S. Patent No. 6,338,057 filed Dec. 7, 1998) teaches storing
prefixes and endings in association with the stem. 
Decker et al. (U.S. Patent No. 5,229,936 filed Jul. 20, 1993) teaches 
labeling and storing the stems and sequences of character strings. 
King et al. (U.S. Patent No. 5,953,541 filed Jan. 24, 1997) teaches
reducing the required vocabulary space by only storing one stem that will 
match several different words and related frequency information. 
Lau et al. (U.S. Patent No. 5,467,425 Nov. 14, 1995) teaches separating a 
textual corpus into several classes based on frequency of words and 
forming n-grams for the entire vocabulary set.
Ushioda (U.S. Patent No. 5,835,893 Nov. 10, 1998) teaches detecting a
frequency of words different from one another and arranging the plurality
of words in descending order of appearance frequency, and separating
the words into appropriate classes.



15. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Lamont M Spooner whose telephone number is
703/305-8661. The examiner can normally be reached on 8:00 AM - 5:00 PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Talivaldis Smits, can be reached on 703/306-3011. The fax phone number
for the organization where this application or proceeding is assigned is 703-872-9306.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 



lms

03/12/04 




